Conversation
|
Caution Review failedThe pull request is closed. WalkthroughReplaces the CCM scanner used for range partitions with Changes
Sequence Diagram(s)sequenceDiagram
participant Worker as Scan Worker
participant SavePoint as BucketScanSavePoint
participant Plan as BucketScanPlan
participant Batch as ScanBatchTxRequest
participant Scanner as CCM Scanner\n(HashParitionCcScanner)
rect rgba(100,150,220,0.06)
Note over Worker,SavePoint: initialize save point & plan
Worker->>SavePoint: create & pin save_point
SavePoint->>Plan: expose plan and currentIndex
end
rect rgba(120,200,150,0.06)
loop while currentIndex < Plan.size()
Worker->>Batch: submit batch (with plan ref & currentIndex)
Batch->>Scanner: scan partition(s)
Scanner-->>Batch: results / progress
Batch-->>Worker: status (progress / yield)
alt progress
Worker->>SavePoint: advance currentIndex
else yield/no progress
Worker->>SavePoint: persist position (yield)
end
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
storage/eloq/ha_eloq.cc (2)
4876-4883: Potential tight loop; clarify ScanBatchTxRequest::Result() semantics and add progress guardIf Result() returns false and no tuples are returned for a plan, current_index never advances, causing a spin. Please confirm that tx side guarantees progress for the same plan (e.g., save_point advances internally). If not, add a guard to break or back off when no progress is observed.
Example minimal guard:
- Track last_seen_count; if scan_batch.empty() for N consecutive iterations with same current_index, break and log.
Also, consider reading prev_pause_idx_ semantics: if it refers to the “next” plan index, you might want to start from prev_pause_idx_ + 1 to avoid re-scanning.
I recommend:
- Document Result() meaning in a comment.
- Add a no-progress safeguard.
Also applies to: 4903-4924
7331-7506: Pushdown type refactor LGTM; small fixes to improve robustness
- Good: Switch to txservice::DataStoreSearchCond and DataStoreDataType aligns with new API.
- Suggest: Expand literal type check to include MYSQL_TYPE_VAR_STRING so varchar constants don’t get skipped.
- Bug: assert(true && "Unsupported cmp function.") never fires. Use assert(false && ...) or remove and handle via default.
Diff sketch:
- pushdown= (val_field->field_type() == MYSQL_TYPE_STRING || - val_field->field_type() == MYSQL_TYPE_VARCHAR) + pushdown= (val_field->field_type() == MYSQL_TYPE_STRING || + val_field->field_type() == MYSQL_TYPE_VARCHAR || + val_field->field_type() == MYSQL_TYPE_VAR_STRING) ? true : false; ... - default: - assert(true && "Unsupported cmp function."); - break; + default: + assert(false && "Unsupported cmp function."); + break;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
storage/eloq/eloq_catalog_factory.cpp(1 hunks)storage/eloq/ha_eloq.cc(12 hunks)storage/eloq/ha_eloq.h(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
storage/eloq/ha_eloq.h (1)
storage/eloq/ha_eloq.cc (2)
BindPushedCond(7331-7506)BindPushedCond(7331-7331)
🔇 Additional comments (4)
storage/eloq/ha_eloq.h (1)
669-669: Approve namespace simplification
No remaining references tostore::DataStoreSearchCondorstore::DataStoreDataTypefound.storage/eloq/eloq_catalog_factory.cpp (1)
687-687: Verify spelling ofHashParitionCcScanner: No local definition was found in this codebase. Confirm the correct class name in the txservice API (likelyHashPartitionCcScanner) and update this reference at line 687 accordingly.storage/eloq/ha_eloq.cc (2)
119-119: Include sys/types.h is correctNeeded for ssize_t (e.g., readlink usage). No issues.
7509-7572: PushedConditionString enum switch updated correctlyString quoting, blob hex, and numeric passthrough look consistent with new txservice::DataStoreDataType values.
5294331 to
db7af96
Compare
ff944b9 to
d507e74
Compare
0ec099a to
3a3352b
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
storage/eloq/ha_eloq.cc (1)
4870-4895: Fix OOB: PickPlan must occur after bounds checkYou still call save_point->PickPlan(current_index) before verifying current_index < plan_size. If prev_pause_idx_ equals plan_size, this dereferences past end. Move PickPlan inside the while and stop re-picking within the body.
Apply this diff:
- size_t plan_size= save_point->PlanSize(); - BucketScanPlan plan= save_point->PickPlan(current_index); - while (current_index < plan_size) + size_t plan_size= save_point->PlanSize(); + while (current_index < plan_size) { + BucketScanPlan plan= save_point->PickPlan(current_index); scan_batch.clear(); ScanBatchTxRequest scan_batch_req(scan_alias, scan_table_name, &scan_batch, yield_func, resume_func, - txm, -1, {}, &plan); + txm, -1, {}, &plan); txm->Execute(&scan_batch_req); scan_batch_req.Wait(); if (scan_batch_req.IsError()) { ret= HA_ADMIN_FAILED; break; } - - if (scan_batch_req.Result()) - { - current_index++; - if (current_index < plan_size) - { - plan= save_point->PickPlan(current_index); - } - } + if (scan_batch_req.Result()) { + current_index++; + } }
🧹 Nitpick comments (1)
storage/eloq/ha_eloq.cc (1)
7303-7478: Pushdown safety and small robustness tweaks in BindPushedCond
- Skip NULL constants to avoid meaningless “col op NULL” pushdowns.
- For VARCHAR/TEXT with '=' you switch to '>=' only; add an upper bound (“< nextPrefix(val)”) to avoid scanning supersets while keeping correctness.
- Minor: reserve result capacity.
Minimal changes:
std::vector<txservice::DataStoreSearchCond> ha_eloq::BindPushedCond() { + // Reserve to reduce reallocs + std::vector<txservice::DataStoreSearchCond> res; + res.reserve(pushed_conds_.size()); @@ Item_field *col_field= pushed_cond.col_field_; Item_literal *val_field= pushed_cond.val_field_; + // Do not push down comparisons with NULL + if (val_field->null_value) { + continue; + } @@ - switch (func_type) + switch (func_type) { case Item_func::EQ_FUNC: op.append("="); break; @@ } @@ switch (field_type.data_type_) { @@ default: { StringBuffer<MAX_FIELD_WIDTH> str; field->val_str(&str); val_str.append((char *) str.ptr(), str.length()); // VARCHAR/TEXT type if (field_type.data_type_ == EloqDataType::VarString) { // Remove trailing space. size_t endpos= val_str.find_last_not_of(' '); if (endpos != std::string::npos && endpos < val_str.length() - 1) { val_str.replace((endpos + 1), (val_str.length() - endpos - 1), ""); } - // Fetch values from kv without regard for trailing sapces. - if (!op.compare("=")) - { - op.assign(">="); - } + // For '=' push both lower and upper bounds to avoid supersets: + // >= trimmed and < next prefix + if (op == "=") { + // Lower bound (>=) + res.push_back({field_name, std::string(">="), val_str, data_type}); + // Upper bound: next lexicographic prefix + std::string hi = val_str; + hi.push_back('\xFF'); + res.push_back({field_name, std::string("<"), hi, data_type}); + // Restore original field and continue (already pushed both) + field->move_field(table->record[0] + field_offset, + maybe_null ? table->record[0] + null_offset : nullptr, + field->null_bit); + continue; + } } break; } } @@ - res.push_back({field_name, op, val_str, data_type}); + res.push_back({field_name, op, val_str, data_type});Note: nextPrefix via appending 0xFF is acceptable under binary collations used for pushdown here. If non-binary collations are later allowed, use the key-def ceiling helper to compute a safe bound.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
storage/eloq/eloq_catalog_factory.cpp(1 hunks)storage/eloq/ha_eloq.cc(16 hunks)storage/eloq/ha_eloq.h(2 hunks)storage/eloq/store_handler(1 hunks)storage/eloq/tx_service(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- storage/eloq/store_handler
🧰 Additional context used
🧬 Code graph analysis (1)
storage/eloq/ha_eloq.h (1)
storage/eloq/ha_eloq.cc (2)
BindPushedCond(7303-7478)BindPushedCond(7303-7303)
🔇 Additional comments (6)
storage/eloq/tx_service (1)
1-1: Verify submodule pointer updates and provide build fix rationale.Two concerns prevent approval:
Unverifiable commits: The old tx_service commit (0a9fe67a54) returns "not our ref" error, suggesting it may have been deleted or force-pushed. The new commit (eb799bb36cc5169d4961e6e5c94b410fc983c665) cannot be independently verified as existing in the remote repository.
Missing context: The commit message "update submodule" provides no explanation of what specific build issue is being fixed. No code changes in the main repository clarify the rationale.
Before merging, confirm:
- Both submodule commits exist and are stable in their remote repositories
- What build failure this PR resolves and that the fix has been tested
- Whether the "not our ref" error on the old commit indicates intentional history changes
storage/eloq/ha_eloq.h (2)
669-669: LGTM: Namespace change for DataStoreSearchCondThe parameter type update correctly reflects the move of
DataStoreSearchCondfrom thetxservice::store::namespace to the top-leveltxservice::namespace. This change is consistent with the implementation in ha_eloq.cc.
746-746: LGTM: Namespace change for BindPushedCond return typeThe return type update correctly reflects the move of
DataStoreSearchCondfrom thetxservice::store::namespace to the top-leveltxservice::namespace. This aligns with the function implementation and the broader namespace refactoring.storage/eloq/eloq_catalog_factory.cpp (1)
686-692: Verify class name against txservice library documentation or build outputThe class name
HashParitionCcScannerappears to have a spelling inconsistency ("Parition" vs. "Partition"), but the class definition is not found in the accessible codebase—it comes from the externaltxservicelibrary. The typo claim cannot be verified without access to the txservice library source or documentation. If the class name is indeed incorrect, the compilation will fail. Verify the correct spelling in txservice headers or build logs before proceeding.storage/eloq/ha_eloq.cc (2)
119-119: LGTM: portability and guard cleanupssys/types.h include and the conditional LOG/DS macro guards look fine; aws_deinit moved under matching compile-time guards in deinit is correct.
Also applies to: 169-172, 216-219, 226-229, 3304-3307
7481-7544: String/blob rendering for diagnostics looks correctSwitching on txservice::DataStoreDataType with proper escaping/hex formatting is sound. No issues.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
storage/eloq/ha_eloq.cc (1)
7303-7317: Pushdown type migration LGTM; fix default branch to avoid empty operatorType shift to txservice::DataStoreSearchCond/DataStoreDataType looks correct. However, the default in the func_type switch uses assert(true), which never trips and may leave op empty.
Apply:
- default: - assert(true && "Unsupported cmp function."); - break; + default: + DBUG_ASSERT(false && "Unsupported cmp function."); + continue; // skip unsupported comparatorAlso applies to: 7328-7360, 7395-7397
♻️ Duplicate comments (1)
storage/eloq/ha_eloq.cc (1)
4870-4872: Critical: PickPlan before bounds check can read past end (still present)If current_index == plan_size (e.g., prev_pause_idx_ equals PlanSize), PickPlan(current_index) dereferences out of range. Move PickPlan inside the while after verifying bounds. Also drop the mid-loop reassignment.
Apply:
- size_t plan_size= save_point->PlanSize(); - BucketScanPlan plan= save_point->PickPlan(current_index); - while (current_index < plan_size) + size_t plan_size= save_point->PlanSize(); + while (current_index < plan_size) { + BucketScanPlan plan = save_point->PickPlan(current_index); scan_batch.clear(); ... - if (scan_batch_req.Result()) - { - current_index++; - if (current_index < plan_size) - { - plan= save_point->PickPlan(current_index); - } - } + if (scan_batch_req.Result()) { + current_index++; + } }Also applies to: 4889-4893
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
storage/eloq/eloq_catalog_factory.cpp(1 hunks)storage/eloq/ha_eloq.cc(16 hunks)storage/eloq/ha_eloq.h(2 hunks)storage/eloq/store_handler(1 hunks)storage/eloq/tx_service(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- storage/eloq/store_handler
🚧 Files skipped from review as they are similar to previous changes (2)
- storage/eloq/ha_eloq.h
- storage/eloq/eloq_catalog_factory.cpp
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-25T11:41:40.111Z
Learnt from: githubzilla
PR: eloqdata/eloqsql#127
File: storage/eloq/build_log_service.cmake:97-124
Timestamp: 2025-09-25T11:41:40.111Z
Learning: The `storage/eloq/build_log_service.cmake` file is specifically for the open log service build and only supports ROCKSDB as its log state backend. Unlike the enterprise version in `build_eloq_log_service.cmake`, it does not need to handle cloud-specific variants like ROCKSDB_CLOUD_S3 or ROCKSDB_CLOUD_GCS.
Applied to files:
storage/eloq/tx_servicestorage/eloq/ha_eloq.cc
📚 Learning: 2025-09-25T11:33:33.221Z
Learnt from: githubzilla
PR: eloqdata/eloqsql#127
File: concourse/scripts/build_tarball_open.bash:136-136
Timestamp: 2025-09-25T11:33:33.221Z
Learning: The open log service (OPEN_LOG_SERVICE=ON) in concourse/scripts/build_tarball_open.bash only supports ROCKSDB as its log state and does not use the WITH_LOG_STATE parameter that was introduced to replace USE_ROCKSDB_LOG_STATE in other parts of the codebase.
Applied to files:
storage/eloq/ha_eloq.cc
🔇 Additional comments (6)
storage/eloq/tx_service (1)
1-1: Verify submodule compatibility — manual review of commits required.The submodule update cannot be directly verified in the sandbox environment (repository access limitations). However, the parent repository extensively references
txservice::types (30+ occurrences across type definitions, method signatures, and data structures), indicating the update is intentionally aligned with namespace restructuring and type migration mentioned in the PR summary.Recommend manually verifying:
- The commits included in the submodule pointer advancement contain the necessary
txservice::type definitions- No breaking changes to type signatures or public APIs referenced in parent repo (e.g.,
txservice::DataStoreSearchCond,txservice::TransactionExecution)- The
store::DataStoreScannerusage at line 798 ofstorage/eloq/ha_eloq.hremains compatible with the updated submodulestorage/eloq/ha_eloq.cc (5)
119-119: Good fix: add sys/types.h for ssize_tNeeded for readlink/ssize_t. No concerns.
169-172: Cloud DS guard is clearer and saferDefining ELOQDS_RKDB_CLOUD when S3/GCS variants are set improves readability.
Please build once with each DS combo (ROCKSDB_CLOUD_S3, ROCKSDB_CLOUD_GCS, ROCKSDB, ELOQSTORE) to ensure no conflicting defines.
216-219: Deriving LOG_STATE_TYPE_RKDB_ from S3/GCS/RKDB defines*These guards make downstream includes (#if defined(LOG_STATE_TYPE_RKDB_CLOUD/ALL)) robust.
Sanity-check compile with OPEN_LOG_SERVICE=ON/OFF and each LOG_STATE variant to confirm only intended branches activate.
Also applies to: 226-229
3304-3307: Pair aws_deinit with init pathsSymmetric shutdown avoids SDK leaks. Looks good.
7481-7481: Stringification update matches new DataStoreDataTypeCovers String/Blob/Numeric correctly, with escaping/hex handling.
Confirm no callers rely on the old store:: namespace types.
Also applies to: 7498-7535
| } | ||
| } | ||
|
|
||
| scan_batch.clear(); |
There was a problem hiding this comment.
Likely lock leak: clearing scan_batch before ScanCloseTxRequest
You pass scan_batch to ScanCloseTxRequest, but clear it immediately before. This likely drops the last batch’s unlock metadata.
Apply:
- scan_batch.clear();
+ /* keep the last batch for ScanCloseTxRequest to unlock */📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| scan_batch.clear(); | |
| /* keep the last batch for ScanCloseTxRequest to unlock */ |
🤖 Prompt for AI Agents
In storage/eloq/ha_eloq.cc around line 4896, clearing scan_batch before calling
ScanCloseTxRequest risks dropping the last batch’s unlock metadata and causing a
lock leak; move the scan_batch.clear() so it executes after ScanCloseTxRequest
(or otherwise preserve the batch contents until the call completes) so that
ScanCloseTxRequest receives the full batch data before any clearing.
Summary by CodeRabbit
Refactor
Chores